The k-Anonymity Problem Is Hard

Authors

Paola Bonizzoni

Gianluca Della Vedova

Riccardo Dondi

Abstract

The problem of publishing personal data without giving up privacy is becoming increasingly important. An interesting formalization recently proposed is k-anonymity. This approach requires that the rows of a table be clustered in sets of size at least k and that all the rows in a cluster become the same tuple after some entries are suppressed. The natural optimization problem, where the goal is to minimize the number of suppressed entries, is known to be NP-hard when the values are over a ternary alphabet, k = 3, and the row length is unbounded. In this paper we give a lower bound on the approximation factor that any polynomial-time algorithm can achieve on two restrictions of the problem, namely (i) when the record values are over a binary alphabet and k = 3, and (ii) when the records have length at most 8 and k = 4, showing that these restrictions of the problem are APX-hard.
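The objective described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' construction: the function names, the "*" suppression symbol, and the sample table are assumptions made for illustration. Given a table and a partition of its rows into clusters of size at least k, every column on which a cluster disagrees is suppressed in all rows of that cluster, and the cost is the number of suppressed entries.

```python
from collections import Counter

STAR = "*"  # assumed symbol for a suppressed entry


def suppression_cost(rows, clusters, k):
    """Count the entries suppressed when each cluster collapses to one tuple."""
    covered = sorted(i for cluster in clusters for i in cluster)
    if covered != list(range(len(rows))):
        raise ValueError("clusters must partition the row indices")
    cost = 0
    for cluster in clusters:
        if len(cluster) < k:
            raise ValueError("every cluster needs at least k rows")
        # Walk the cluster column by column.
        for column in zip(*(rows[i] for i in cluster)):
            if len(set(column)) > 1:
                # The rows disagree here, so this entry is suppressed in each row.
                cost += len(cluster)
    return cost


def anonymize(rows, clusters):
    """Return the table after suppressing the disagreeing columns per cluster."""
    out = list(rows)
    for cluster in clusters:
        columns = zip(*(rows[i] for i in cluster))
        shared = tuple(c[0] if len(set(c)) == 1 else STAR for c in columns)
        for i in cluster:
            out[i] = shared
    return out


def is_k_anonymous(rows, k):
    """A table is k-anonymous if every distinct row occurs at least k times."""
    return all(count >= k for count in Counter(rows).values())


# Hypothetical binary table with k = 3, as in restriction (i).
rows = [(0, 0, 1), (0, 1, 1), (0, 0, 1), (1, 1, 0), (1, 1, 0), (1, 0, 0)]
clusters = [[0, 1, 2], [3, 4, 5]]

print(suppression_cost(rows, clusters, 3))            # 6
print(is_k_anonymous(anonymize(rows, clusters), 3))   # True
```

The sketch only evaluates one given partition. The hardness results concern finding a partition of minimum cost, which this code does not attempt.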

Similar resources

Improved Univariate Microaggregation for Integer Values

Privacy issues during data publishing are an increasing concern of involved entities. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...

Resolving the Complexity of Some Data Privacy Problems

We formally study two methods for data sanitation that have been used extensively in the database community: k-anonymity and l-diversity. We settle several open problems concerning the difficulty of applying these methods optimally, proving both positive and negative results: – 2-anonymity is in P. – The problem of partitioning the edges of a triangle-free graph into 4-stars (degree-three verti...

Parameterized Complexity of k-Anonymity: Hardness and Tractability

The problem of publishing personal data without giving up privacy is becoming increasingly important. A clean formalization that has been recently proposed is the k-anonymity, where the rows of a table are partitioned in clusters of size at least k and all rows in a cluster become the same tuple, after the suppression of some entries. The natural optimization problem, where the goal is to minim...

Pattern-Guided k-Anonymity

We suggest a user-oriented approach to combinatorial data anonymization. A data matrix is called k-anonymous if every row appears at least k times—the goal of the NP-hard k-ANONYMITY problem then is to make a given matrix k-anonymous by suppressing (blanking out) as few entries as possible. Building on previous work and coping with corresponding deficiencies, we describe an enhanced k-anonymiza...

The Effect of Homogeneity on the Complexity of k-Anonymity

The NP-hard k-Anonymity problem asks, given an n × m matrix M over a fixed alphabet and an integer s > 0, whether M can be made k-anonymous by suppressing (blanking out) at most s entries. A matrix M is said to be k-anonymous if for each row r in M there are at least k − 1 other rows in M which are identical to r. Complementing previous work, we introduce two new “data-driven” parameterizations fo...

Checking for k-Anonymity Violation by Views

When a private relational table is published using views, secrecy or privacy may be violated. This paper uses a formally-defined notion of k-anonymity to measure disclosure by views, where k>1 is a positive integer. Intuitively, violation of k-anonymity occurs when a particular attribute value of an entity can be determined to be among less than k possibilities by using the views together with ...

Publication date: 2009
